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HIGHLIGHTS 


e Strategy of tuning gene expression ratio from perspective of noise and correlation is discussed. 

e Analytical results of genes expression noise and correlation for the two-state and the constitutive promoter are given. 

e Noises and correlation of gene expressions in operon with the two-state promoter are higher than in operon with the constitutive promoter. 
e Polar effect obviously lowers the correlation between genes in operons. 

e Tuning translation rates is the optimal strategy to tune gene expression ratio in operon. 
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Genes are organized into operons in procaryote, and these genes in one operon generally have related 
functions. However, genes in the same operon are usually not equally expressed, and the ratio needs to 
be fine-tuned for specific functions. We examine the difference of gene expression noise and correlation 
when tuning the expression level at the transcriptional or translational level in a bicistronic operon 
driven by a constitutive or a two-state promoter. We get analytic results for the noise and correlation of 
gene expression levels, which is confirmed by our stochastic simulations. Both the noise and the 
correlation of gene expressions in an operon with a two-state promoter are higher than in an operon 
with a constitutive promoter. Premature termination of mRNA induced by transcription terminator in 
the intergenic region or changing translation rates can tune the protein ratio at the transcriptional level 
or at the translational level. We find that gene expression correlation between promoter-proximal and 
promoter-distal genes at the protein level decreases as termination increases. In contrast, changing 
translation rates in the normal range almost does not alter the correlation. This explains why the 
translation rate is a key factor of modulating gene expressions in an operon. Our results can be useful to 
understand the relationship between the operon structure and the biological function of a gene network, 
and also may help in synthetic biology design. 

© 2014 Elsevier Ltd. All rights reserved. 


1. Introduction 


Gene expression is noisy, which can influence its function 
(Eldar and Elowitz, 2010; Hao et al., 2011a; Paulsson et al., 2000; 


An operon is a unit of genomic DNA containing a cluster of 
genes under the control of a single promoter (Jacob and Monod, 
1961). Genes in an operon generally belong to the same function 
class (Salgado and Moreno-Hagelsieb, 2000; Dandekar et al., 1998), 
and they are transcribed together into a mRNA strand. Their 
expressions can be regulated coordinately, which is one of the 
reasons that genes are organized into operons (Price et al., 2005). 
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Ray and Igoshin, 2012; Chalancon et al., 2012). Noise from the 
stochasticity of transcription and translation process is recognized 
as the intrinsic noise. Noise introduced by other factors such as 
different concentrations of RNA polymerase, ribosome or tran- 
scription factors is recognized as the extrinsic noise. The extrinsic 
noise is relatively slow in the same environment (Taniguchi et al., 
2010; Rosenfeld et al., 2005). Intrinsic and extrinsic noises toge- 
ther determine the gene expression profile (Taniguchi et al., 2010; 
Rosenfeld et al., 2005; Paulsson, 2004). 

Proteins work together to execute specific functions in the cell. 
Different components in the same pathway need to be coordinated 
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to get the robust result. For example, protein complex formation is 
most efficient when subunits of a protein complex are expressed 
with the needed ratio (Carmi et al., 2006, 2009; Swain, 2004). 
Expressions of enzymes in a metabolism pathway usually need to 
be correlated to avoid intermediate product cumulation (Ray and 
Igoshin, 2012) which may be toxic. Experiment shows that gene 
expressions in the same operon are more correlated together than 
those in the separated operons even with identical promoters, 
especially when gene expression levels are low (Tabor et al., 2008). 
The strengthened correlation can be explained by co-transcription 
(Sneppen et al., 2010; Swain, 2004; Ray and Igoshin, 2012; Iber, 
2006) or in some cases also by co-translation (Swain, 2004; Iber, 
2006). Statistics from Escherichia coli genomics indicate that for 
genes with products in linear metabolic pathways, with physical 
interactions and in covalent modification modules, they are more 
likely to be organized into operons (Ray and Igoshin, 2012). Noise 
of the genes expression from one operon can be further attenuated 
by different mechanisms, for example, transcriptional or transla- 
tional feedback (Swain, 2004). Sometimes subunits of macromo- 
lecular complex are produced from different operons. Swain 
(2004) shows that both translational and transcriptional feedback 
can increase the correlation between these subunits, but transla- 
tional feedback is more efficient. 

However, even in the same operon, genes are not expressed at 
the equivalent level. The cell needs to fine-tune the gene expres- 
sion ratio to fit their biological functions. The tuning can be 
executed at the transcriptional or the translational level. In this 
work, we try to study the difference of noise and correlation when 
the cell takes distinct tuning strategies. 

Tuning transcription rate or stability of mRNA can change the 
MRNA level, which will be passed to the protein levels by 
translation. Lifetime of mRNAs in bacteria is relatively short. Most 
mRNAs will be degraded by various ribonucleases (Condon, 2007) 
and live only several minutes (Selinger et al., 2003; Bernstein et al., 
2002; Kristoffersen et al., 2012). Different transcription rates can 
be introduced by the internal promoter or terminator. Intergenic 
Rho-dependent or Rho-independent (Lesnik et al., 2001) termi- 
nator can introduce the polar effect which means genes in the 
promoter-proximal region are transcribed more compared with 
the promoter-distal ones. Recently, transcriptome sequencing of 
Mycoplasma pneumoniae (Giiell et al., 2009) and synthetic biology 
studies in E. coli (Nishizaki et al., 2007; Hiroe et al., 2012) indicate 
that this effect might be a global phenomenon even for an operon 
without an obvious termination signal in the intergenic region. As 
for mRNA degradation rate, although distinct regions in one 
operon may have different degradation rates (Selinger et al., 
2003; Kristoffersen et al., 2012), for those genes with close 
function relationship, their transcripts seem to have the similar 
degradation rates (Bernstein et al., 2002; Kristoffersen et al., 2012; 
Quax et al., 2013). Previous works indicate that co-transcription of 
genes in one operon is the main source of correlation (Sneppen et 
al., 2010; Swain, 2004; Ray and Igoshin, 2012; Iber, 2006), so it can 
be expected that tuning gene expression ratio at transcriptional 
level would lower the correlation. 

Although it cannot be ruled out that tuning translation degra- 
dation would be one of the mechanisms to adjust the protein 
expression level, it is well known that most proteins in the 
prokaryotic cell are not degraded to avoid wasting (Gur et al., 
2011). It is more likely that cells tune a protein level by adjusting 
protein synthesis rate. Many features embedded in coding 
and non-coding regions of the mRNA sequence can influence the 
translation initiation and elongation rate. The Shine-Dalgarno 
sequence and its distance to translation initiation codon will 
affect the assembling of translation complex (Vellanoweth and 
Rabinowitz, 1992; Chen et al., 1994). The structure of 5’-UTR will 
affect the accessibility of the ribosome binding site (Goodman 


et al., 2013; Bentele et al., 2013; Hao et al., 2011b). Codon usage 
will influence translation elongation (Tuller et al., 2010; 
Cannarrozzi et al., 2010; Klumpp et al., 2012). The internal 
Shine-Dalgarno like sequence may induce translation pausing (Li 
et al., 2012; Wen et al., 2008). 

In this work, we use stochastic models to describe the tran- 
scription and translation processes of a bicistronic operon driven 
by a constitutive or a two-state promoter (Fig. 1A) respectively. We 
get analytical results of intrinsic noise and correlation of gene 
expressions. We utilize termination efficiency of terminator to 
characterize the stiffness of polar effect, i.e., the ratio of tran- 
scripts that are prematurely terminated. Stochastic simulations 
confirm our analytical results. Our results show that the correla- 
tion between proteins decreases monotonously as termination 
efficiency increases and the correlation almost does not vary as the 
translation rate changes in the normal range. It may help to 
understand gene organization and its evolution in prokaryotes. 


2. The bicistronic operon with the constitutive promoter 


In our analysis, we focus on the statistics when the system is in 
the steady state. We use (q) and ôq? to denote the mean and 
variance of numbers of species q, the Fano factor v = 6q?/(q) and 
coefficient of variance 7 = \/5q? /(q) to characterize noise. The Fano 
factor can be a measurement for deviation from the Poisson 
behavior. It takes the value of 1 for a Poisson process. We use the 
Pearson correlation coefficient p= ((q;— < qı >)(d2— <q2>))/ 

qt ôq? to denote the correlation of the numbers of two species 
qi and q2. 
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Fig. 1. (A) Two kinds of promoters used in our study. (B) Gene expression process 
of the bicistronic operon with the constitutive promoter. T1 represents any kind of 
polar effect mechanism: Rho-dependent or Rho-independent terminator in inter- 
genic region of operon or other possible mechanisms (Giiell et al., 2009). The 
termination efficiency of T1 is e, the transcription rate of promoter is K, the 
translation rates of RFP and GFP are Kpr and Kpg, the degradation rates of R, RG, RFP 
and GFP are respectively yp, ypc, ype and ypc. For the two-state promoter, K should 
be replaced with k;x, and transcription happens only when the promoter is in the 
on state. 
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2.1. The model of the bicistronic operon with the constitutive 
promoter 


The state of a gene expression system can be specified by the 
numbers of mRNA molecules and proteins. For the stochasticity of 
gene expression, there are fluctuations in these numbers. We can 
describe this kind system by a joint distribution function 
£(1- q2, ---Gj---2 Ini ©, here q; is the number of i-th molecule, which 
will be mRNA or protein. We simply write f(q1, q2, ...qj---» qn; È) as 
fa» and iis ranging from 1 to n. For simplicity, we suppose that the 
birth and death rates of molecules in the system are linearly 
dependent on the numbers of different species. Then dynamic 
evolution of f,, is described in the scheme: 


kř (qj) 
— fyri 


fa 
kr (qj) 
—fa-1 


fa 


kř (q) = Aja; kr (4) = ET jq; 
J J 


where kř (q) and kp q) denote the birth and death rates of i-th 
molecule, matrices A and J’ provide the coefficients of their linear 
dependence on the molecular numbers. 

We study the system of a bicistronic operon as described in 
Fig. 1B. We use RFP and GFP to denote genes in the operon with 
RFP near the promoter. These two fluorescent proteins can be 
employed in an experiment to check our theoretical results. T1 is 
Rho-dependent or independent terminator or any other possible 
mechanism that would attenuate downstream gene transcription. 
T2 is the terminator that defines the end of the operon. Termina- 
tion efficiency of T1 is denoted as e, which also represents the 
stiffness of polar effect for the operon. For simplicity, we suppose 
that transcription and translation rates do not depend on each 
other. Although transcription and translation are coupled together 
in prokaryote, this assumption is still suitable for most of the cases. 

We first study the bicistronic operon with the constitutive 
promoter. Transcription start from the constitutive promoter at the 
rate K as shown in Fig. 1B. Two kinds of mRNA are transcribed 
from this operon. One is full-length mRNA with both RFP and GFP, 
which is denoted as RG. The other one is attenuated mRNA only 
with RFP, which is denoted as R. The synthesis rate of R and RG are 
Kr=eK and Kgg =(1—e)K. The translation rates of RFP and GFP 
are Kpr and Kpg, the degradation rates of R, RG, RFP and GFP are 
respectively yr, Ypcs Ypr and ypg- If we consider the growth dilution, 
the effective degradation rate Y = Y4egraa +H, Where Ydegraa is the 
degradation rate and y is the growth rate of the cell. In our system, 
qi = {D, R, RG, RFP, GFP}, where D is the number of DNA, i.e., the 
copy number of the operon. Matrices A and T for this bicistronic 
operon with the constitutive promoter are: 


0 0 000 00 0 0 0 
Krke 0 © 00 Ox 0 0 0 
A=|Kec 0 0 0 Ol, r=l0 0 yc 0 0 
O Ke Ke 0 0 0 0 O yk 0 
0 0 Ke 00 00 0 0 ye 


(G9) 


Transcription process of the bicistronic operon with the con- 
stitutive promoter includes a transcription initiation at rate K 
followed by a termination at the T1 terminator with a possibility of 
e (Fig. 1B). In our model, we used an equivalent description: two 
independent transcription processes of R and RG with the rates of 
Kr=eK and Krg=(1—e)K. It is easy to prove that these two 
descriptions give the same distributions of R and RG: two inde- 
pendent Poisson distributions. 


2.1.1. Method to get the means and variances. 

We use a similar approach as Thattai and van Oudenaarden 
(2001) in our theoretical derivations. The above scheme for our 
model can be described by the following Master Equation: 


fy =G'-1) (saa fe +(Ej*'-1) (zra) (2) 
J J 


where E is the step operator defined as EFC. qi.) = 
f(....q;+k,...). This equation can be solved using generating 
function method. First we denote generating function 
F(Z, t = Ya = 1,200] = 1,02 )fq,- In our system, I” is diagonal: 
Tij =ô; i, and then F obeys 


F= 2a —Zi) (a TAF) í (3) 
l J 


where F; = ðzF. In the steady state, F=0. 

The mean and variance of species will be given by following 
properties of generating function: Fl, = 1; Fjl (qj): Fa (a?) 
(qi); and if i#j, Fyly = qid; )» where Fij = ð; 0,,F and |; denotes 
evaluation at z;=1 for all j. Successive differentiations of Eq. (3) 
will generate linear equations of higher order moments. First we 
define the vector J; = F;, and the matrix Kj; = Fij, where the indexes 
of K mean row and column and the indexes of F mean differentia- 
tion. Differentiating Eq. (3) up to the second moment, we obtain 
the following equations for the steady state: 


(A-Ty=0, KA-DK+49= -(A-NDK+1L], (4) 


where Lj =AjjJ; (not summation) (Thattai and van Oudenaarden, 
2001). 

For our bicistronic operon model, J = ((D), (R), (RG), (RFP), (GFP)). 
By solving Eq. (4), we got the mean (J) and variance(K) of 
gene expression, and the correlation of different genes in the 
steady state. 


2.2. Results of the bicistronic operon with the constitutive promoter 


2.2.1. Statistics of mRNAs 

Both R and RG obey the Poisson distribution, which means their 
Fano factors are equal to 1. They are also independent with each 
other. The average numbers of R and RG are 


Kr ek 

Ranke’ 5 
wy YR YR “i 
eye 07K (6) 

YRG YRG 

Coefficient of variance of mRNAs are 

naw OR? Yr 1 7) 
RR) Vek Ry 

VRG? Tre 1 (8) 
e= RO ~ V-e)K 7 ARG 


From these two equations, we can see that lower level of mRNA 
expression leads to larger fluctuation. 


2.2.2. Statistics of proteins 

During the short lifetime of the mRNA, proteins are synthesized 
by ribosomes with the mRNA, so the profile of translation of 
protein is bursty as shown in Fig. 2B. This is one of the sources of 
the intrinsic noise. The number of protein produced from a single 
mRNA obeys the geometric distribution (Berg, 1978). We use b to 
denote the burstiness of the translation process. It is the average 
number of protein translated by ribosomes during the lifetime of 
one mRNA, for example, bpr-r = Kpr/yp is the average number of 
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Fig. 2. Sampling of gene expression profiles from stochastic simulation for the operon with the constitutive promoter. Small translation burst sizes in (A 
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correlation. In the simulations 7 pg = Ypg = 0.00029 s- !, 7p = ¥p¢ = 0.0033 s- 1, K = 0.0005 s- 1, e = 0, other parameters in (A) Kpr = Kpg = 0.0033 s-1, the average translational 


burst size of RFP and GFP are bpr = bpc = 1; (B) Kpr = Kpg 


RFP protein translated in the lifetime of one RG mRNA. Similarly, 
we denote bpr-rc = Kpr/Yp¢ and bpc = Kpc/Ygg. In the cell, protein 
lives much longer than MRNA, i.e., Ypr. Ypg < Yr- Yrc, SO We Made 
approximations under this assumption below. Average numbers of 
proteins are 


K Ker Kr K 
(REP) = RGR) +-(RG)) = RERO) 
YPR YPR YR RG 
K 
= —[ebpr-r + (1 — e)bpr-rc], (9) 
YPR 
Kpg Kpg KRG (1 = eK R 
(GFP) (RG) bpc. (10) 
YPG YPG YRG YPG = 
The ratio between them is 
(RFP) _ Kpr/Kpc aes l 1| aD 
(GFP) Ypr/¥pc L Yr/Yr L 


We can see different strategies tuning protein expression ratio in 
one operon: changing termination efficiency e, changing the 
translation rate ratio Kpr/Kpc, changing the protein or mRNA 
stability ratio, i.e., Ypp/Ypg, OF Yr/Yrcç- Although tuning protein 
ratio at protein degradation rates cannot be ruled out, it seems 
that the prokaryotic cell avoids to degrade protein to avoid 
wasting (Gur et al., 2011). For functional related proteins in the 
same operon, especially for those without transcriptional regula- 
tory function, mRNAs tend to have the similar lifetimes (Bernstein 
et al., 2002; Kristoffersen et al., 2012; Quax et al., 2013). It seems 
that adjusting translation rates or termination efficiency is widely 
used mechanism in the prokaryotic cell to tune the gene expres- 
sion ratio (Quax et al., 2013; Cho et al., 2009; Güell et al., 2009). 
The Fano factor of GFP is 


SGFP* Kpg 
(GFP) Ype +YRG 


which is the same as in Thattai and van Oudenaarden (2001). The 
distribution for the molecule numbers of protein is broader than 
the Possion distribution. 

Coefficient of Variance of GFP protein is 


1a Pc (12) 


YRG 


1 1 1 
"Tare ™ | (GP) "7, Tac (RG) 
PG 


0.066 s~! , bpr = bpc = 20. 


1 YPG 
ta —e)K 


(GFP) 
ra 1) l 


If we fix other parameters, the intrinsic noise of GFP is decreasing 
as the burst size bp increasing. When bpc is small, burst of 
translation is a critical source of intrinsic noise, when bpg is big, 
the intrinsic noise is mainly determined by the transcription 
process, i.e., by K and e, considering ypg is primarily determined 
by growth dilution. Increasing promoter strength (K) or decreasing 
premature termination (e) would repress mRNA expression noise 
as shown in Eq. (8), and then lower noise would be passed to the 
protein level. Our result agrees with Swain et al. (2002). 
When e 40 and e #1, the Fano factor of protein RFP is 


x 


_ YPG 1. 
SVTK (= a3) 


RFP? Kor l (RG) | = Ker l (R) |+ r 
(RFP) (Yre +Ypr) RG) +R)|  (YrR+YprR) LRG) +(R) 
1 1 
~ Dpr_rc e TRG + bpr-R EAA +1. a4 
Te yr ae a Tar 


Here we can see that the noise of RFP is derived from the Poisson 
processes of the translations from the two kinds of mRNAs. The 
ratio depends on their ratio in the total MRNA. When e-—1, then 


SREP” Kpr 
<RFP> (Yp+V7pr) ” 


and when e-—0, then 


SRFP* Ker 
<RFP> (Yp¢+Ypr) ` 


Under these two conditions, RFP protein is produced from only R 
or RG, so RFP has the similar distribution as GFP (Eq. (12)). 

When Ypg = Yp, Dpr-rc = brr-r = bpr. And then the Fano factor of 
RFP can be simplified into 


RFP? 
(RFP) 


= (1 —e)Dpr_rc+ ebpr_r+1 = bpr+1, (15) 
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the Fano factor only depends on the translational burst size of RFP, 
which is in agreement with the well-known result for single gene 
expression (Thattai and van Oudenaarden, 2001). 

Coefficient of variance of RFP protein is 


1 1 1 1 1 1 
"REP t + . 
(RFP) Yra \ | 44—© Rc Yr 1-—e yr | | (RG)+<(R) 
144% + 14 142 
( =) 1—e Yr ( a] te YRG 
(16) 
If Yk =Ypc, then 
1, 1 1 
“Tree = | (REP) | (14 ) (RG) + (R) 
YPR 
By 1 YPR 
~\ (REP)! K 
—  |Yer(_1 
~\«K (t1) a 


where bpr = bpr-r = bpr-rg. RFP can be translated equally from 
both R and RG, and these two kinds mRNAs have the same 
degradation rates as assumed. So termination efficiency (e) would 
not influence the noise of RFP. From Eqs. (13) and (17), clearly 
noise of RFP comes from: (1) noise inherited from the transcrip- 
tion process which is measured by transcription rate 1/K; (2) noise 
from burstiness of translation measured by burst size bpr. When 
transcription rate K or burst size bpr is small, the fluctuation of RFP 
expression would be significant. 


2.2.3. The correlation between RFP and GFP mRNAs 

In order to see how much correlation at the mRNA level would 
be passed to the protein level, we first analyze the correlation 
between numbers of mRNAs that contains RFP or GFP. 


(RGR) — (RG+R))(RG—(RG))) 


VARG+R)? VRG? 
1 


PmR-mG 


SR? 
SRG? 


_ (RG) 
~ V RGR 


=1/,/1+- £s, (18) 


1+ 


which is square root of the ratio of RG mRNA in the total MRNA. 

This equation shows that the correlation between mRNAs that 

contains RFP and GFP is from co-transcription of RFP and GFP. 

Longer lifetime of RG relative to R will lead to stronger correlation. 
If Yk = Yr, then 


PmR-mG = 1-e, (19) 


the correlation between RFP and GFP at the mRNA level is only 
dependent on the termination efficiency e of the terminator. 


2.2.4, The correlation between mRNA and protein 

We then calculate the correlation between mRNA and protein 
expressions. The Pearson correlation coefficient between RG and 
GFP is 


1 
PmG-GFP = ( l jl 


Yr V 
14 + ) (1 $ ) 
Kpc/Yr¢ Krc/Yp¢ 7 PG 


(20) 


where mG is RG, which is the only mRNA coding GFP protein. 
Proteins live much longer than mRNAs, which means 


1 : and YPC ~ 0, 


SS > ni 

Kpg/Yrq KrG/Ypr YRG 

then 

PRc-crp ~ I ~0. (21) 
Kpc/Yrc 


When 7pgg=7p and Yprp=Ypc, We got the correlation between 
mRNAs and RFP protein: 


1 X Yer/YR ~0 


P mR-RFP = (1+ I, 1 J0 te) 14 1 
Krr/Yr  Krr/Ypr Yor Kpr/Yp 


(22) 


where mR indicates the total number of transcripts containing RFP 
coding sequence, which include full-length transcripts RG and 
short ones R. From Eqs. (21) and (22), we see that there is almost 
no correlation between mRNA and protein due to 7 protein <YmrNa- 
Actually, experiments show that protein levels do not or just 
weakly correlate with mRNA levels (Taniguchi et al., 2010; Golding 
et al., 2005). Due to the much longer lifetime of protein, the 
fluctuation at the mRNA level is smoothed out at the protein level, 
which leads to the weak correlation between mRNA and protein. 


2.2.5. The correlation between RFP and GFP proteins 

Next we calculate gene expression correlation between two 
proteins, which has the biology significance for large protein 
complex formation, and for reaction systems sensitive to enzyme 
correlation (Ray and Igoshin, 2012). The Pearson correlation coeffi- 
cient of the numbers of RFP and GFP protein is 


(RFP - GFP) — (RFP)GFP) 
VRFP2\/5GFP? 
The general analytic formula is a little complex, and we do not show 
it here. In order to get intuitive understanding of the dependence of 
Prrp-crp ON different parameters, we make simplifications below. 


If YR = Yr» Ypr = Ypg then the correlation coefficient of RFP and 
GFP can be simplified into 


(23) 


PRFP-GEP 


1—e 


PRFP- 
REP-GFP ( | i F 1 (14 1 | 1 
 Ker/Yre  Kpr/Ypr Kpc/Yrq  Kec/Ypc 


1—e 


(e e) 


l-e 
VA+1/bpr) +1/bpc) 


where Dpr=bpr_r=Dpr_rc, fOr Yrg=Yr. Changing promoter 
strength (K) would not influence the correlation, although it can 
influence noise of mRNA and protein. When e=0, our result is the 
same as Ray and Igoshin (2012). In Eq. (24), the numerator /1—e 
represents protein correlation inherited from the mRNA level (see 
Eq. (19)) and the denominator shows the influence from the bursty 
translation process. If brrp or bgrp is small, then the number of RFP 
or GFP produced in the lifetime of one RG has large fluctuation 
(Eq. (17)). But translation events of RFP and GFP are independent 
and their fluctuations are unrelated, which means correlation be- 
tween RFP and GFP would be low. This has been shown by the sto- 
chastic simulations in Fig. 2A (simulation method see Appendix B). 
However, when both of bgrp and brrp are much larger than 1, 
fluctuations of RFP or GFP are both small (Eqs. (13) and (17)). So 


x 


(24) 
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Fig. 3. Dependence of the correlation between RFP and GFP on terminator efficiency e, translation burst size b, translation rates of proteins and the degradation rates of 
mRNAs in bicistronic operon with the constitutive promoter. (A-F) all data are obtained from the analytical results, lines in surface are contours of pprp_crp. (G) Cycles are got 
from stochastic simulations and lines are got from analytical results. In all figures ypg = ypg = 0.00029 s~!, other parameters in each subplot are specified as follows: in 
(A) ¢=0.3, 7p =yrc =0.0033 s-!; (B) yp =7p¢ =0.0033 s-!,Kpr =0.16s7!; (C) e=0.3,yg=0.0033 s~',Kpg=0.16s—'; (D) e=0.3,yp=0.0033 s~!, Kpr =0.16s—!; 
(E) e = 0.3, ypg = 0.0033 s~!, Kpc = 0.16 s~!; (F) e = 0.3, yg = 0.0033 s—!, Kpr = 0.16 s71; (G) K =0.3 s-!, Kpr = 0.16 s7 !, ypg = yr = 0.0033 s~!. 
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bursty translations have little effect on the correlation of two 
proteins, which has been shown by the simulation results in 
Fig. 2B. From Eq. (24) we can see that when both bpr and bpg are 
much bigger than 1 (Fig. 3A), Pprp_crp ~ V1—e, which means the 
correlation is mainly determined by the polar effect at the 
transcriptional level. 

We plot the relationship between /prp_crp and other para- 
meters in Fig. 3 using the formula of Pprp_crp (Eq. (23)) without 
any simplification except supposing Ypg = Ypg. In Fig. 3G, we can 
see stochastic simulations agree with our analytical results very 
well. In Fig. 3B and G we can see that Pprp_crp decreases as e 
increases, i.e., transcription termination at the internal terminator 
directly lowers the protein correlation. This result can explain why 
Monte Carlo simulation of Liang et al. (2013) only showed a small 
effect of intergenic terminator on the correlation. They take e=0.5, 
Prep-crep ~% V1—e 0.7 and the correlation is lowered just about 
30%. Taking bigger e, for example, e=0.9, Prrp-crp ~ V1—e © 0.32, 
then the polar effect will lower the correlation about 70%, and the 
influence of polar effect would be obvious. In prokaryotic cells, 
mRNAs live about several minutes on average (Selinger et al., 
2003; Bernstein et al., 2002; Kristoffersen et al., 2012). In the range 
of mRNA degradation rate (half-life of mRNA 0.5-15 min), adjust- 
ing the translation rate almost does not alter the protein correla- 
tion except translation rate is very small (Fig. 3B-G). This means 
adjusting protein expressions at the translation level will be the 
optimal strategy to tune the gene expression ratio in the same 
operon. If we take the half-life of mRNA 5 min, we can tune the 
translation rate between 0.01 and 1 s~! without obviously chan- 
ging the correlation of the proteins (Fig. 3G). Actually, comparative 
genomic study of Quax et al. (2013) shows that different transla- 
tion rate is a key determinant of the gene expression ratio in an 
operon. Only those RFPs translated from RG contribute to the 
correlation with GFP. Decreasing the stability of mRNA R (increas- 
ing yg) or increasing the stability of MRNA RG (decreasing 7p,) will 
increase the ratio of RFP correlated with GFP protein, which leads 
to a higher correlation (Fig. 3C-F). At the same time, increasing the 
stability of mRNA RG (decreasing 7p) will increase the burst size 
of correlated RFP and GFP translation from RG, which also leads to 
a higher correlation (Fig. 3C and D). 


3. The bicistronic operon with the two-state promoter 


Some single-cell experiments show that mRNA synthesis is also 
bursty for some promoters which can be described by the two- 
state model (Golding et al., 2005; So et al., 2011; Chong et al., 
2014). As shown in Fig. 1A, these promoters stochastically transit 
between ‘on’ and ‘off states. The mRNA can be synthesized with 
the transcription rate krx only when the promoter is in the on 
state. The transition rate from the off state to the on state is Kon, 
which determines the average duration time of off state (1/Kon). 
The transition rate from the on state to the off state is kg, which 
means the average duration of transcription pulse is 1/k gy. The 
average transcription burst size, i.e., the average number of mRNA 
synthesized in one period of on state is k7x /Ko. We study the gene 
expression noise in the bicistronic operon driven by the two-state 
promoter (Fig. 1). By solving the Master Equation, we get analytical 
results for the noises and the correlation at the mRNA level and 
the protein level. These analytical results are confirmed by 
stochastic simulation (procedure in Appendix B). 


3.1. The model of the bicistronic operon with the two-state promoter 


As in the system with the constitutive promoter, we use 
fa (= 1...n) to describe the system. Here the state of the promoter 
is indicated by qi, q4 = 1 for the on state and q, =0 for the off 


state. The molecular numbers of other species are indicated by qi 
(i=2...n), and q;=0,1...0c. We rewrite fy, as fq, 4, (i=2...n), the 
dynamic evolution of f,, 4, is described in the scheme: 


koff 
fi 0.4; & f 1,9; 
Kon 


kt q) 
11.4: Fara +1 


kr (qj) 
fan —fna-1 


kř (q) = ZAijq ki (q) = DL; 
j j 


(i=2...n, j= 1...n), 


where kř (q) and ki (qj) denote the birth and death rates of i-th 
molecule, matrices A and I’ provide the coefficients of their linear 
dependence on the molecular numbers. In our systems, qi= 
{D, R, RG, RFP, GFP}, where D is the number of active promoter, 
i.e., the number of the operon that can be transcribed. Matrices A 
and J’ are same as those we used for the operon with the 
constitutive promoter (Eq. (1)). 


3.1.1. Method to get the means and variances 
The above scheme for the two-state promoter can be described 
by the following Master Equation: 


faa = Er -D (zaa) aa tE -D (zr w) ani 
J J 
+( = 1)" (kop hg, —; Konf og, ). (25) 


For the promoter in the on or the off state we denote the 
generating functions: 


Fon(Zj, t)= > ( I 2 Va. (26) 
Gy 210,000 \y = 2,000 

Fz D= È ( II 2 foa (27) 
qu = 0,...00 \v = 2,...n 

and for the whole system the generating function is 


F(Zj, t) =2Z1Fon(Zj.0)+Fog(Z.t). In our system, I” is diagonal: 
I, = 6,I;, then F obeys 


FS 20x 
+ (Kop Fon —KonF of), (28) 


where Fj = ðz F. In the steady state, F = 0. 

The mean and variance of species will be given by following 
properties of generating function: Fj|; = (aj): Fi = (q?) — (qi); and 
if iAj, Fýlh = Fil = (aia; , where Fy=0,,0,F and |; denotes 
evaluation at zj = 1 for all j. Specially, F11|, = 0; and ifi41, then 
Fon ila +Fop ili = (qi), Fon ilı = Fail. Successive differentiations of 
Eq. (28) will generate linear equations of higher order moments. 

We denote fon = Fonl1 = Èq, =0....cof1g, Which is the possibility 
that promoter stays in the on state, and similarly fog =Fogli 
= Ya =0....cof 0.4, is the possibility of the off state. From the 
normalization of the possibility, we know that: 


Son +Fop = 1. (29) 


First we define the vector J; = F;, and the matrix Kj = Fij, where 
the indexes of K mean row and column and the indexes of F mean 
differentiation. Differentiating Eq. (28) up to the second moment, 
we obtain the following equations for the steady state: 


konf off = Kopf on =0, (30) 


F= > qd a(n > afi) +21 (KonFog — Kog Fon) 
n Beso lees 
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TiKin— È AjKj -An + Rog Ki — KonJi —K1i) = 90 (i= 2...n), 
J= Len 


(31) 


(A-T) =0, 


KA -DK +L; = -(A-DK +4; @j=2...m, GD 

where Lij = Ajj; (not summation). 

3.2. Results of the bicistronic operon with the two-state promoter 
By solving Eqs. (29)-(32) we get the noise and the correlation 


between molecules for the bicistronic operon with the two-state 
promoter. The results are shown below. 


3.2.1. Statistics of mRNAs 
The average number of mRNAs are 


(R) = Fonk (33) 
YR 
(RG) — a = OM onkrx (34) 


YRG 


where fon = Kon/(Kon +Kog) is the fraction of time or the possibility 
that the promoter is in the on state. If we denote K =f ,nkrx as 
the effective transcription rate of the two-state promoter, then 
the average number of R and RG take the same formula as 
Eqs. (5) and (6) for the constitutive promoter. 

The Fano factors of mRNAs are 


OR? 
-~= 1+ eA, 35 
®) R (35) 
5RG 
RG = TO Are, (36) 
where 

koff krx and Ages koff krx 


R — kon + kof Yr F Kon + kof) Kon + koff Yra + Kon + Kopp)” 


which characterize the influence of the two-state promoter. Our 
results of the Fano factors for the mRNAs are the same as previous 
results (Peccoud and Ycart, 1995). The Fano factors of R and RG are 
bigger than 1, which means the distribution of R and RG is not 
Poisson. Experimental measurements of MRNA show that the Fano 
factors of some MRNAs are greater than 1 (Golding et al., 2005). 
This implies that the two-state promoter model may be a more 
realistic description of these promoters (Golding et al., 2005; So 
et al., 2011; Chong et al., 2014). 
Coefficients of Variance of mRNAs are 


1 
B= Vm tee» (37) 


[1+(1 —e)Agc]. (38) 


VRC -\ 1 


kc = (RG) (RG) 


The noise of mRNA depends on the mean expression level: the 
lower the expression level, the higher the noise. This behavior is 
similar to the mRNA transcribed from the constitutive promoter 
(Eqs. (7) and (8)). However if we keep the same average expression 
level of mRNA, the operon with the two-state promoter has higher 
noise of mRNA compared with the constitutive promoter. If we 
adjust the parameters of the two-state promoter and let 
Ar, Arc— 0, then the Fano factors and coefficients of variance of 
the mRNAs will be the same as those for the operon with the 
constitutive promoter (Eqs. (5)-(8)). 


3.2.2. Statistics of proteins 
The average expression levels of proteins are 


K, k 

(REP) = KRUR) + RGY = 12 ebre r+ (1 — e)brr-rcl. (39) 
YPR YPR 
Kpg a =f onka e, (40) 


ee YPG a YPG 
where Dpr_rg = kKrr/Yrg» Der-r=kpr/Yp and bpg_rc=kpc/Yp¢ are 
the translational burst sizes of RFP and GFP. If we use the effective 
transcription rate K =f nkrx in Eqs. (39) and (40), the average 
expressions of the RFP and GFP for the two-state promoters 
take the same formula as those for the constitutive promoter 
(Eqs. (9) and (10)). 

When ypgg = Yr, then Dpr_rc = brr-r = bpr and the Fano factors 
of GFP and RFP are 


SGFP* Kig 


= 1+(1-—e)Bc]+1 © bpoll +(1-—e)Bgl+1, 41 
(GEP) Yoc+ Yra. ( Be] pcll +( )Bc] (41) 
RFP? Kpr 

= 1+Bp)+1 ~ bpr +Bp) +1, 42 
(RFP) Vout Yea R) PR( R) (42) 
where 

koff krx(Y rG + Ypg + Kon + Kopp) 


= Kon + koff (Yra + Kon + Kop )(Y pG + Kon + koff) 
and 


_ koff krx(Yrg +Y pr + Kon + Koff) 
R kon + kop (Yra + Kon + koff \Y pr + Kon + kop) 


We can see that the bigger the translational burst size, the bigger 
the Fano factors. The Fano factors of GFP and RFP (Eqs. (41) 
and (42)) for the two-state promoter are bigger than those for 
the constitutive promoter (Eqs. (12) and (14)), and the influence of 
the two-state promoter is characterized by Bg and Br. 

When ypg = 7p, the coefficients of variance of GFP and RFP are 


al kpg 1 1 
= H 1 B 
lce , | (GFP) ( is Yre t A 14 (RG) ne 


YPG 
~ 1 tieid E ii 
(GFP) ad —e)f onkrx 
YPG 1 | 
= +1+(1 Bgl, 43 
TE [s Te ea 
Nre ~ 4] L ETEB (44) 
RFP fonKkrx bpp : 


We can see that noises of RFP and GFP are bigger than those for 
the constitutive promoter (see Eqs. (13) and (17)), and the 
influence the two-state promoter is characterized by Br and Be. 
The bigger the translational burst size, the smaller the noise of the 
proteins, which is the same as the operon with the constitutive 
promoter. 


3.2.3. The correlation between mRNAs and the correlation between 
proteins 

For the operon with the constitutive promoter, both synthesis 
and degradation of RG and R are independent with each other, so 
numbers of RG and R are uncorrelated. However, for the operon 
with the two-state promoter, mRNAs RG and R are synthesized 
only when the promoter is in the on state, and both mRNAs are 
just under degradation in the off state. This will bring correlation 
between numbers of RG and R, which is clearly shown in the 
stochastic simulation (Fig. 4B). The correlation between RG and R 
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depends on the dynamics parameters of the two-state promoter: 
Kon, Kog and krx. We will get the analytical result for the correlation 
and discuss the influences of these parameters. 

When ypg = Yp, then Ar = Arg =A, we will get 


PRG-R (45) 
y (1+ 
P mR-mG = (46) 
where 
koff krx 


kon + Koff (Yra + Kon + Kopp) 


Unlike the operon with the constitutive promoter where 
Pro_rp=9, MRNAs RG and R transcribed from a two-state 
promoter have nontrivial correlation which is dependent on e 
and A. If we adjust the dynamics parameters of the two-state 
promoters, the correlation will increase as A become larger. The 
first product term of A is Sopp = Kopf Kon + kopf) which is the 
possibility that promoter is in the off state. The effects of for 
on the correlation are shown in Fig. 4A and B. If the promoter 
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stays in the on state too long, the correlation between RG and R 
will be disturbed by the independent synthesis of RG and R 
(Fig. 4A). The second product term is krx/(Yrg+ Kon + Kop). It 
quantifies the influence of the transcriptional burst size 
(krx/Kkoşp) on the correlation. If the transcriptional burst size is 
very small, the possibility that both RG and R are synthesized in 
a period of on state will be small. This leads to small correlation 
between the numbers of RG and R (Fig. 4C, both the burst sizes 
of RG and R are 0.15). Taken together all of these results, we get 
the conclusion that the more time promoter stays in the off state 
and the bigger transcriptional burst size, the bigger the correla- 
tion between mRNAs. When A*}1, Pmr-mg>1, which means 
promoter dynamics strongly influences the correlation. 

When Ypg = 7p and Ypg = Ypg then Br = Bg =B, the correlation 
between proteins is 


Vv1-e 
PRFP-GFP i a 
Kog (RG + Yc) p Yrct7rc) e 
+ 14-2 
1+B 1+B 141 
x 1—e (47) 
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Fig. 4. Sampling of gene expression profiles from stochastic simulation for the operon with the two-state promoter. Promoter stays in the on state for long time (A) or small 
transcription burst size (C) would lower the correlation between two mRNAs. In the simulations yp = ypc = 0.0033 s~!,e = 0.5, kon = 0.006 s~ !, krx = 0.3 s~!; other 
parameters in (A) ką = 0.00075 gal. fog © 0.11, the average transcriptional burst sizes of RG and R are bgc = br = 200; (B) kp =0.006 s- A fop =0.5, bre = br = 25; 


(C) kaf =1 57}, fog © 1, bre = br =0.15. 
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When Yge > Yro» 


a Kopf krx 
Kon + koff (Ypg + Kon + Kor) 


We can see the correlation between proteins is bigger than in the 
operon with the constitutive promoter (Eq. (24)). Changing 
dynamics parameters of the two-state promoters could change 
the correlation between proteins. If we adjust the parameters of 
two-state promoter and let BO, correlation between proteins 
will be the same for the constitutive promoter. 


3.2.4. Effects of dynamics parameters on the correlation 

Among three dynamics parameters of the two-state promoter, 
E. coli cell seems to adjust kof to modulate the transcription 
strength (So et al., 2011). We first investigate how the correlation 
depends on ky. Both the stochastic simulation and analytical 
results are shown in Fig. 5A and B. They agree with each other very 
well. As we discussed above, the two-state promoter would 
increase the correlation between proteins compared with the 
constitutive promoter. When koff decreases to zero (Fig. 4A), the 
promoter dynamics will be the same as the constitutive promoter, 
and the promoter will not be turned off. When kog > Kon (Fig. 4C), 
the promoter stays in the on state only for a very short time and 
the transcriptional burst sizes are very small. So the possibility 
that both RG and R are synthesized in same period of on state will 
be small, and the correlation between RG and R will be lost. 
Between these two limits, the correlation would increase first and 
then decrease as ko increase further (Fig. 5A and B). The exact 
theoretical kog at which Pprp_crp gets the maximal value is a little 
complicated for general case. But from Eq. (47), we can see that 
Prep-crp Zet its maximal value when B get the maximal value for 
the case of yg = Ypg and Ypp = Ypg: When Ypg > Ypg 


Z koff krx 
kon + koff (Ypg+ Kon+ Koff) 


will get the maximal value at kop = \/Kon(Kon +Ypc), which means 
the correlation between RFP and GFP will get the maximal value at 
this point. 

We also study the correlation between proteins with different 
dynamics parameters while keeping the average expression level 
constant. When we adjust kop or kon, we adjust krx accordingly to 
keep the effective transcription rate K = konkrx/(Kon + Kop) unchan- 
ged. Analytical results and stochastic simulations for the correla- 
tion between proteins are shown in Fig. 5C-F. If we adjust the 
dynamics parameters of transcription, the correlation between the 
numbers of mRNAs coding for RFP and GFP will change, and the 
correlation between proteins varies accordingly. In the case of 
fixing kon and increasing kg, the duration time of one transcription 
burst decreases, and the time that the promoter stays in the off 
state are not changed. The average numbers of mRNAs R and RG 
are unchanged, but these mRNAs are synthesized in a series of 
shorter time windows and in other times are just degraded. This 
will lead to larger correlation between mRNAs, which will trans- 
mit to proteins level (Fig. 5C and D). Based on similar reason, fixing 
koş and increasing kon will decrease the correlation between the 
number of mRNAs coding for RFP and GFP, and then decrease the 
correlation between proteins (Fig. 5E and F). 

In Fig. 5A, C and E, a variation of terminator efficiency does not 
significantly change the correlation between proteins in some 
regimes. This is because the correlation between R and RG is 
significant in these regimes (see Eq. (46)), so the ratio of RG in the 
total mRNA (determined by e) would not obviously influence the 
correlation between proteins. This phenomenon is one of the main 
differences between these two kinds of promoters. Outside of 
these regimes, the polar effect will affect the correlation between 
proteins obviously: the larger the termination efficiency, the 


smaller the correlation (see Fig. 5A, C, and E). However, changing 
the translation rate in the normal range (0.01-1 s~') almost does 
not vary the correlation (see Fig. 5B, D, and F). The dependence is 
obvious only when the translation rate is very small (below 
0.001 s~!) which is quite rare for protein synthesis in bacteria. 
So adjusting translation rate is still a better strategy than adjusting 
termination efficiency to tune expression ratio of genes in the 
operon. This is the same with the constitutive promoter. 


4. Discussion 


In the post-genome era, understanding the relationship bet- 
ween gene sequences, genes organization and biological functions 
is a great challenge. In this work, we study noise of gene 
expression in the bicistronic operon with polar effect. We try to 
figure out how different parameters affect the correlation of these 
two gene expressions. Our result may be helpful to understand 
which mechanism cell chooses to coordinate gene expression 
levels and why. 

Previous experiments have shown that the expression correla- 
tion between two genes in the same operon is stronger than those 
in two separate operons even with identical promoters (Ray and 
Igoshin, 2012; Tabor et al., 2008). This strengthened correlation 
comes from co-transcription of the two genes (Sneppen et al., 
2010; Swain, 2004; Ray and Igoshin, 2012; Iber, 2006). It makes 
sense that intergenic terminator would lower the correlation. If 
co-translation is not considered, translation processes of different 
genes in the same operon are independent with each other. This 
may lead the correlation of gene expressions to be further reduced 
at the protein level. 

For the large protein complex synthesis, subunits need to be 
produced in a determined ratio to avoid wasting and to increase 
the efficiency (Carmi et al., 2006, 2009; Swain, 2004). As shown in 
Figs. 3 and 5, if the cell tunes the subunit expression level in 
transcription level using terminator in the intergenic region, the 
correlation of these two proteins would be lowered, which is not 
favored by the cell. However, altering the translation rate almost 
does not alter the correlation in a large parameter space 
(Figs. 3 and 5). So tuning the protein levels with the translation 
rate would be a better strategy. Actually, recent work by Quax et al. 
(2013) shows that different translation rates are a key factor 
determining expression levels of genes in an operon. Correlative 
expressions of enzymes in the same metabolic pathway will avoid 
intermediate cumulation which may be harmful. Thus putting 
these enzyme genes in an operon without obvious polar effect 
would benefit cell growth (Ray and Igoshin, 2012). 

Using bioinformatic methods, it was found that there are a lot 
of terminators in the intergenic region of the operons (Lesnik et al., 
2001). It is practical that a cell may tune the expression ratio at the 
transcriptional level for those genes in an operon by inserting 
terminators between genes. So there is a balance between the 
advantage of co-regulation in the same operon and the detrimen- 
tal effect of lowered correlation by the polar effect. For those 
circumstances that function are not sensitive with the correlation, 
operon with intergenic terminator would be one of the choices to 
tune the expression ratio. For example, recA and recX genes form 
an operon with an intergenic terminator in E. coli (Pagés et al., 
2003). RecA and RecX are two proteins with the opposite functions 
in the process of repairing of double strand DNA breaking. Normal 
cell function needs both of them (Stohl et al., 2003). RecA forms 
poly-RecA to repair DSB (double strand breaking), and RecX can 
inhibit function of RecA as the ratio of RecX to RecA varies in a 
broad range (Stohl et al., 2003). So low correlation between them 
would have little impact on their function. It is no wonder why a 
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Fig. 5. Effects of polar effect and translation rate on the correlation between RFP and GFP expressed from a bicistronic operon driven by a two-state promoter. Lines are the 
analytical results and symbols are from stochastic simulations. (A,B)(C,D)(E,F) are correspondent with three ways of modulation of the two state promoter. In (A,B) Kon, Krx are 
fixed, only kop is changed; in (C,D) kon and the average MRNA expression levels are fixed, kop is changed and kry is changed correspondingly; in (E,F) kag and the average 
mRNA expression levels are fixed, kon is changed and krx is changed correspondingly. Parameters used are as follows: ypr = ypc = 0.00029 s~!, yp = ypc = 0.0033 s~!, 
Kpr =0.16 s71, in (A,C,E) kpc = 0.16 s~!; (B,D,F) e=0.7; (A,B) kon = 0.006 s~ 1, krx = 0.3 s~1; (CD) kon = 0.006 s~1; (E,F) ko = 0.006 s-t. 


cell utilizing a terminator with e ~ 0.9 maintains a relatively lower 
level of RecX with RecA without affecting their functions. 
Changing mRNA stability can tune the protein expression ratio, 
but it will also change the protein correlation significantly. Increas- 
ing lifetime of full-length mRNA will enhance the correlation, and 


increasing lifetime of the short one gets the opposite effect. It is 
because a longer lifetime of full-length mRNA would lead to a 
bigger translation burst, and a larger ratio of protein produced from 
the full-length mRNA, both of which can increase the correl- 
ation. Increasing the lifetime of short mRNA just lowers the ratio 
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of uncorrelated RFP from total RFP protein. However, the cell seems 
to avoid tuning the mRNA stability for different parts of one operon 
when the proteins they code have related function (Bernstein 
et al., 2002; Kristoffersen et al., 2012; Quax et al., 2013). 

Intrinsic and extrinsic noises together determine gene expres- 
sion profiles (Taniguchi et al., 2010; Rosenfeld et al., 2005; 
Paulsson, 2004). In this work, we just consider the former. The 
intrinsic noise is the main source of noise when the average 
number of protein is low (Taniguchi et al., 2010; Elowitz et al., 
2002). For a single cell, extrinsic noise varies relatively slowly. 
Experiments show it would introduce correlation for independent 
expressed proteins in the time of 20min to one cell-cycle 
(Taniguchi et al., 2010; Rosenfeld et al., 2005). However, 20 min 
is a long time for fast growing cells, so our intrinsic results would 
not lose the sense. 

Genes work as a network. So to understand the genome design 
principle we need to consider interactions between genes, includ- 
ing the influence from the stochasticity of gene expression. Our 
analytical results can give direct estimation for the relation of 
noise and correlation of gene expression, which may help to 
understand the relationship between genome structure and robust 
gene interaction network. Recent development of single cell or 
molecule experiment offers quantitative measurement of gene 
expression in time and space dimension (Taniguchi et al., 2010; So 
et al., 2011; Nevo-Dinur et al., 2011). Our results may help to 
design and understand this kind of experiments. 
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Appendix A. Parameter estimation for gene expression in 
E. coli 


1. The mRNA degradation rate: about 80% of mRNAs have a half-life 
time 3-8 min (Bernstein et al., 2002). We use Pmrya = 1/5 
min~! ~ 0.0033 s~! by default, and vary mRNAs half-life just 
in the range of 0.5-15 min when we discuss the parameter 
dependence. 

2. The protein degradation rate: protein is only under growth 
dilution in bacteria, almost no degradation, so protein = 10 
2/T = 0.00029 s~', T is the doubling time of bacterial cell, we 
choose T=40 min. 

3. The translation rate: considering the limit for translation elon- 
gation and the time needed for translation initiation (about 1 s, 
Proshkin et al., 2010), the fastest translation rate should be 
around 1s~!, the smallest we use 0.0011 s~! (average one 
translation happens from the most stable mRNA during its 
15 min half-life time). The average translation rate is in the 
range of 0.1-0.3 s~' (Bremer and Dennis, 1996). For E. coli with 
doubling time T=40 min, the average translation rate is around 
0.16 s~' which we used as default (Bremer and Dennis, 1996). 

4. The transcription rate: the highest estimation 1 s7! is from rrn 
genes transcription for the fast growing E. coli cell (Bremer and 
Dennis, 1996). 


5. The two-state promoter parameters: from Fig. S12C and D of So 
et al. (2011), we use kon =0.006s~!, kon =0.006s~!, kx = 
0.3 s7! as default. 


Appendix B. Stochastic simulation method 


The Gillespie (1977) algorithm, developed by Daniel T. Gillespie 
in 1970s, is well known as an effective stochastic simulation 
algorithm. By calculating the reaction probability density function 
(RPDF) P(t, p) using the current reaction rate and reactant species 
combination number, probability of when the next reaction will 
occur (T) and which reaction will occur (#4) can be provided. Then 
we renew the molecular number of relevant species. In this way, 
we get a Statistically correct evolution trajectory of the chemical 
system, which is rigorously equivalent to that produced by the 
distribution function f(qy,...,q,;t) as the solution of the master 
equation. We use Gillespie algorithm (Gillespie, 1977; Hao et al., 
2011a) to simulate the biochemical reactions shown in Fig. 1. 
Parameters used in the simulations are shown in Appendix A. We 
first generate a long enough stochastic evolution trajectory for 
each chemical species. With these trajectories, we start the 
sampling after the system reaches the steady state (the waiting 
time we used is 10 protein half-lives). In order to make sure that 
the statistic points are fully independent of each other, we select 
the points in the trajectories with the interval of 2 protein half- 
lives time, and then calculate the correlation between different 
species. Each data point of the simulation results is the result of at 
least 10,000 trials. 
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